Analysis method for increasing susceptibility to sorafenib treatment in hepatocellular carcinoma

ABSTRACT

Provided is an analytical method of providing information for the diagnosis of a hepatocellular carcinoma patient having susceptibility to sorafenib. The expression levels of eight genes, i.e., CDH1, CHAD, EFNA2, FANCC, MAP2K1, MEN1, PBRM1, and PPARG genes, are measured in combination in carcinoma tissues of a hepatocellular carcinoma patient, whereby the response (susceptibility) to sorafenib can be predicted with high accuracy. The combination of said genes can be usefully applied as a biomarker for selecting a hepatocellular carcinoma patient having susceptibility to sorafenib.

TECHNICAL FIELD

The present invention relates to an analytical method for increasing susceptibility to sorafenib therapy in hepatocellular carcinoma. More specifically, the present invention relates to an analytical method of providing information for the diagnosis of a hepatocellular carcinoma patient having susceptibility to sorafenib.

BACKGROUND ART

Liver cancer is one of the major fatal cancers, with high mortality and high incidence. Hepatocellular carcinoma (HCC) accounts for the majority of liver cancer cases. During the therapeutic course, up to 50% of HCC patients receive systemic therapy. In systemic therapy for HCC, sorafenib and lenvatinib were approved as first-line treatments without predictive biomarkers. However, sorafenib showed an objective response rate of 2% and a median overall survival of 10.7 months (Llovet J M, et al., Sorafenib in advanced hepatocellular carcinoma. N Engl J Med 2008; 359: 378-390). This absence of predictive biomarkers has led to non-selective treatment, with poor response rates and overall survival. Most other agents for HCC also lack predictive biomarkers. Many researchers have proposed that efforts to identify and validate new predictive biomarkers should continue, in order to help predict the clinical efficacy of, and resistance to, these agents (Califf R M. Biomarker definitions and their applications. Exp Biol Med (Maywood) 2018; 243: 213-221; Twomey J D, Brahme N N, Zhang B. Drug-biomarker co-development in oncology - 20 years and counting. Drug Resist Updat 2017; 30: 48-62).

Biomarkers are used for various purposes. Depending on the use, biomarkers are classified into diagnostic biomarkers, predictive biomarkers, prognostic biomarkers, etc. Among them, a predictive biomarker predicts a favorable or unfavorable effect of a therapeutic intervention (Bhattacharyya A, Rai S N. Adaptive Signature Design review of the biomarker guided adaptive phase-III controlled design. Contemp Clin Trials Commun 2019; 15: 100378). Measures of the effect of a therapeutic intervention include the disease control rate (DCR), which is the rate of non-progression, and the objective response rate (ORR), which is the rate of traditional tumor response (Lara P N, Jr., et al., Disease control rate at 8 weeks predicts clinical benefit in advanced non-small-cell lung cancer results from Southwest Oncology Group randomized trials. J Clin Oncol 2008; 26: 463-467). Classifying patients into favorable and unfavorable groups for a therapeutic intervention by using predictive biomarkers plays an important role in clinical trials and clinical applications. Thus, many researchers have studied predictive biomarkers for various agents in most diseases. Providing the right therapies to the right patients is expected to improve quality of life by preventing unnecessary therapies.

DISCLOSURE

Technical Problem

The present inventors have performed various studies to develop clinically applicable biomarkers capable of predicting disease control by sorafenib. In particular, the present inventors combined weighted genes into a DCR gene signature and validated it with various statistical analyses and meta-analyses. As a result, it has been found that, when the expression levels of specific genes, i.e., CDH1, CHAD, EFNA2, FANCC, MAP2K1, MEN1, PBRM1, and PPARG genes, are analyzed in combination, the response (also referred to as “susceptibility”) to sorafenib treatment can be predicted with high accuracy.

Therefore, it is an object of the present invention to provide an analytical method for predicting a response to sorafenib treatment in a hepatocellular carcinoma patient, the method comprising using CDH1, CHAD, EFNA2, FANCC, MAP2K1, MEN1, PBRM1, and PPARG genes as biomarkers.

Technical Solution

In accordance with an aspect of the present invention, there is provided an analytical method of providing information for the diagnosis of a hepatocellular carcinoma patient having susceptibility to sorafenib, the method comprising measuring expression levels of CDH1, CHAD, EFNA2, FANCC, MAP2K1, MEN1, PBRM1, and PPARG genes, respectively, in carcinoma tissue samples which are externally discharged from the hepatocellular carcinoma patient.

In the analytical method of the present invention, the measuring may be carried out by measuring the mRNA expression levels of CDH1, CHAD, EFNA2, FANCC, MAP2K1, MEN1, PBRM1, and PPARG genes, respectively.

Advantageous Effects

It has been found by the present invention that, when the expression levels of specific genes, i.e., CDH1, CHAD, EFNA2, FANCC, MAP2K1, MEN1, PBRM1, and PPARG genes, in hepatocellular carcinoma tissues of a hepatocellular carcinoma patient are analyzed in combination, the response (susceptibility) to sorafenib treatment can be predicted with high accuracy. Therefore, the combination of said genes can be usefully applied as biomarkers for selecting a hepatocellular carcinoma patient having susceptibility to sorafenib.

DESCRIPTION OF DRAWINGS

FIG. 1 shows a flowchart of gene signature development.

FIG. 2 shows the results obtained by evaluating the clinical performance of the 8-gene signature in predicting disease control by sorafenib. FIG. 2A is the result of the ROC (receiver operating characteristic) analysis and FIG. 2B is the results of the cross validation and logistic regression analyses.

FIG. 3 shows the results obtained by evaluating overall survival and progression free survival in predicted good responders versus predicted poor responders. FIG. 3A is the Kaplan-Meier (KM) curves for overall survival and FIG. 3B is the KM curves for progression free survival.

BEST MODE FOR CARRYING OUT THE INVENTION

As used herein, the term “sorafenib” refers to the compound of the following Formula 1, including its pharmaceutically acceptable salt, for example the p-toluenesulfonate salt.

The term “patient having susceptibility to sorafenib” refers to a hepatocellular carcinoma patient showing a response to sorafenib administration (i.e., a tumor response). The “tumor response” refers to complete response, partial response, or stable disease according to the RECIST (Response Evaluation Criteria in Solid Tumors) defined in Llovet J M, et al. (2008) Sorafenib in advanced hepatocellular carcinoma. N Engl J Med 359: 378-390.

The terms “hepatocellular carcinoma tissues” and “normal tissues” refer to the tissue samples externally discharged, e.g., via biopsy, from the hepatocellular carcinoma tissues and the surrounding normal tissues of a hepatocellular carcinoma patient. In clinics, carcinoma tissues and surrounding normal tissues are generally collected from a patient and then examined for diagnosing hepatocellular carcinoma and/or establishing a therapeutic regimen. Therefore, the terms “hepatocellular carcinoma tissues” and “normal tissues” refer to the tissue samples externally discharged from a patient, e.g., for tissue examination in clinics.

Because of the absence of predictive biomarkers, sorafenib has poor response rates and a short overall survival period in hepatocellular carcinoma (HCC) therapy. A predictive biomarker could potentially improve the effectiveness of sorafenib. The present inventors have performed various studies to develop a clinically useful biomarker that predicts disease control by sorafenib. Using nCounter (Nanostring Technologies, Seattle, Wash.), the present inventors analyzed the expression levels of 770 genes in 73 HCC patients who had received sorafenib treatment. As a result, we identified differentially expressed genes (DEGs) and computed combinations of weighted gene expression for use as a predictive biomarker. To validate the gene signature, we performed cross validation and meta-analysis. As a result, the 8-gene signature showed an area under the curve (AUC) of 0.90 and an accuracy of 91.78%. In cross validation, the 8-gene signature showed a cross validation accuracy of 83.67%. Also, classification with the 8-gene signature revealed that the median overall survival (median OS) was 27.3 months in predicted good responders, compared with 11.3 months in all patients. Therefore, the 8-gene signature provides the best compromise between sorafenib effectiveness and coverage of sorafenib treatment patients.

Therefore, the present invention provides an analytical method of providing information for the diagnosis of a hepatocellular carcinoma patient having susceptibility to sorafenib, the method comprising measuring expression levels of CDH1, CHAD, EFNA2, FANCC, MAP2K1, MEN1, PBRM1, and PPARG genes, respectively, in carcinoma tissue samples which are externally discharged from the hepatocellular carcinoma patient.

In the analytical method of the present invention, the measuring may be carried out by measuring the mRNA expression levels of CDH1, CHAD, EFNA2, FANCC, MAP2K1, MEN1, PBRM1, and PPARG genes, respectively. The mRNA expression levels may be measured according to conventional methods used in the field of biotechnology, for example, by using the nCounter PanCancer Pathway Panel (Nanostring Technologies, Seattle, Wash.).

In the analytical method of the present invention, the eight genes used as biomarkers, i.e., CDH1, CHAD, EFNA2, FANCC, MAP2K1, MEN1, PBRM1, and PPARG genes, are known in the art and the sequences thereof are known in GenBank and the like. For example, the NCBI Accession Numbers of the CDH1 (cadherin 1) protein are NP_001304113, NP_001304114, NP_001304115, NP_004351 and the like; and the NCBI Accession Numbers of the genes encoding the same (mRNAs) are NM_004360, NM_001317184, NM_001317185, NM_001317186 and the like. The NCBI Accession Number of the CHAD (chondroadherin) protein is NP_001258; and the NCBI Accession Number of the gene encoding the same (mRNA) is NM_001267. The NCBI Accession Number of the EFNA2 (ephrin A2) protein is NP_001396; and the NCBI Accession Number of the gene encoding the same (mRNA) is NM_001405. The NCBI Accession Numbers of the FANCC (FA complementation group C) protein are NP_000127, NP_001230672 and the like; and the NCBI Accession Numbers of the genes encoding the same (mRNAs) are NM_000136, NM_001243743 and the like. The NCBI Accession Number of the MAP2K1 (mitogen-activated protein kinase kinase 1) protein is NP_002746; and the NCBI Accession Number of the gene encoding the same (mRNA) is NM_002755. The NCBI Accession Numbers of the MEN1 (menin 1) protein are NP_000235, NP_570711, NP_570712, NP_570713, NP_570714 and the like; and the NCBI Accession Numbers of the genes encoding the same (mRNAs) are NM_000244, NM_130799, NM_130800, NM_130801, NM_130802 and the like. The NCBI Accession Numbers of the PBRM1 (polybromo 1) protein are NP_060783, NP_851385, NP_001337003, NP_001337004, NP_001337005 and the like; and the NCBI Accession Numbers of the genes encoding the same (mRNAs) are NM_018165, NM_018313, NM_181041, NM_181042, NM_001350074 and the like. The NCBI Accession Numbers of the PPARG (peroxisome proliferator activated receptor gamma) protein are NP_001317544, NP_005028, NP_056953, NP_619725, NP_619726 and the like; and the NCBI Accession Numbers of the genes encoding the same (mRNAs) are NM_005037, NM_015869, NM_138711, NM_138712, NM_001330615 and the like.

In an embodiment of the analytical method according to the present invention, the expression levels of CDH1, CHAD, EFNA2, FANCC, MAP2K1, MEN1, PBRM1, and PPARG genes are measured in carcinoma tissue samples externally discharged from a hepatocellular carcinoma patient, and the TBPS (Treatment Benefit Prediction Score) value is calculated according to the following equation. When the TBPS value is greater than −2.483069, the patient can be classified as a patient who exhibits a response to sorafenib treatment (i.e., a patient who exhibits susceptibility to sorafenib treatment); when the TBPS value is −2.483069 or less, the patient can be classified as a patient who does not exhibit a response to sorafenib treatment (i.e., a patient who does not exhibit susceptibility to sorafenib treatment).

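$\mathrm{TBPS} = (-0.000225)\,G_{\mathrm{CDH1}} + (0.001787)\,G_{\mathrm{CHAD}} + (-0.005687)\,G_{\mathrm{EFNA2}} + (-0.002104)\,G_{\mathrm{FANCC}} + (-0.001009)\,G_{\mathrm{MAP2K1}} + (0.002101)\,G_{\mathrm{MEN1}} + (-0.001336)\,G_{\mathrm{PBRM1}} + (0.001710)\,G_{\mathrm{PPARG}}$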

In the above equation, $G_{\mathrm{CDH1}}$, $G_{\mathrm{CHAD}}$, $G_{\mathrm{EFNA2}}$, $G_{\mathrm{FANCC}}$, $G_{\mathrm{MAP2K1}}$, $G_{\mathrm{MEN1}}$, $G_{\mathrm{PBRM1}}$, and $G_{\mathrm{PPARG}}$ represent the gene expression levels of CDH1, CHAD, EFNA2, FANCC, MAP2K1, MEN1, PBRM1, and PPARG genes, respectively. The expression level of each gene is a normalized expression level obtained by using nCounter (Nanostring Technologies, Seattle, Wash.), a gene expression measuring device. The normalization is performed according to the manufacturer's protocol, using nSolver Analysis Software v 3.0 (Nanostring Technologies), which is provided with nCounter (Nanostring Technologies, Seattle, Wash.).
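As a minimal sketch of the scoring rule above, the following Python snippet computes the TBPS as a weighted sum and applies the cutoff. The coefficients and the cutoff are taken from the text; the function names and the example expression values are hypothetical and serve only to illustrate the arithmetic. The sketch assumes that the inputs are nCounter expression levels normalized as described above.

```python
# Coefficients and cutoff as stated in the text above.
TBPS_COEFFICIENTS = {
    "CDH1": -0.000225,
    "CHAD": 0.001787,
    "EFNA2": -0.005687,
    "FANCC": -0.002104,
    "MAP2K1": -0.001009,
    "MEN1": 0.002101,
    "PBRM1": -0.001336,
    "PPARG": 0.001710,
}
TBPS_CUTOFF = -2.483069


def tbps(expression):
    """Weighted sum of normalized expression levels for the eight genes."""
    missing = [gene for gene in TBPS_COEFFICIENTS if gene not in expression]
    if missing:
        raise KeyError(f"missing expression values for: {missing}")
    return sum(coef * expression[gene] for gene, coef in TBPS_COEFFICIENTS.items())


def is_predicted_responder(expression):
    """True when the TBPS is strictly greater than the cutoff."""
    return tbps(expression) > TBPS_CUTOFF


# Hypothetical normalized expression levels, made up for illustration only.
example_patient = {
    "CDH1": 5000.0, "CHAD": 400.0, "EFNA2": 200.0, "FANCC": 700.0,
    "MAP2K1": 2000.0, "MEN1": 600.0, "PBRM1": 1600.0, "PPARG": 700.0,
}
score = tbps(example_patient)
print(round(score, 4), is_predicted_responder(example_patient))
```

A score exactly equal to the cutoff falls into the non-responder class, which matches the "−2.483069 or less" wording above.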

The present invention will be described in further detail with reference to the following examples. These examples are for illustrative purposes only and are not intended to limit the scope of the present invention.

1. Test Methods

(1) Patients and Tissue Samples

This study included 73 histologically confirmed HCC patients. HCC tissues were collected from the 73 patients before sorafenib treatment. All tissues were obtained by needle biopsy. The patients were from Ajou Medical Center (AMC). The protocol of this study was approved by the Institutional Review Board of Ajou Medical Center (AMC).

The HCC tissue samples were snap-frozen in liquid nitrogen and stored at −80° C. Complete clinical information was available for all cases. Patient staging information was obtained from CT or MRI images, and the Barcelona Clinic Liver Cancer (BCLC) staging was used.

(2) Measurement of Clinical Outcomes

Tumor response was assessed by computed tomography (CT) or magnetic resonance imaging (MRI) at 3 months and 6 months after administration of sorafenib, based upon the modified Response Evaluation Criteria in Solid Tumors for HCC (mRECIST). From a DCR perspective, patients with complete response (CR), partial response (PR), or stable disease (SD) were considered responders, whereas those with progressive disease (PD) were judged to be non-responders.

(3) RNA Extraction

Total RNA was extracted from both tumor and non-tumor tissues using the RNeasy mini kit (Qiagen, Hilden, Germany) with DNase I treatment (Qiagen, Hilden, Germany). Total RNA integrity was verified using a Bioanalyzer 2100 (Agilent Technologies, Santa Clara, Calif., USA). Total RNA concentration was measured using a Nanodrop 2000 (Thermo Fisher Scientific, Waltham, Mass., USA).

(3) Gene Expression Assay

Gene expression was analyzed by nCounter MAX (Nanostring Technologies, Seattle, Wash., USA). The total reaction volume was 15 µl, containing 100 ng of RNA, reporter probes, and capture probes. 770 genes (including 40 control genes) were analyzed with the nCounter PanCancer Pathway Panel (Nanostring Technologies, Seattle, Wash., USA). Quality control and normalization of the raw data were performed using nSolver Analysis Software v 4.0 (Nanostring Technologies, Seattle, Wash., USA).

(4) Combination Gene Analysis

Differentially expressed genes (DEGs) were screened to meet one of the following conditions: 1) a statistically significant difference between tumors and non-tumors or 2) a statistically significant difference between sorafenib treatment responders and non-responders. The screened DEGs were further shortlisted through logistic regression analysis for sorafenib response. Before combining the DEGs, we determined the logistic regression coefficient of each gene and weighted the gene expression with the corresponding coefficient value. The gene signature is calculated using the following equation:

$\sum (\text{logistic regression coefficient} \times \text{gene expression value})$

The shortlisted DEGs were analyzed in combination, and the total number of gene combinations was calculated using the following equation:

$\sum\limits_{k = 1}^{n}\frac{n!}{k!\,(n - k)!}$

In the above equation, n is the total number of shortlisted DEGs and k is the number of genes included in a combination.
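To give a sense of scale, the sum above equals $2^{n} - 1$, i.e., the number of non-empty subsets of n genes. The short Python check below evaluates it under the assumption that n = 26, the number of shortlisted genes listed in Table 2; the function name is hypothetical.

```python
from math import comb


def total_combinations(n):
    """Sum of C(n, k) for k = 1..n, i.e. every non-empty gene combination."""
    return sum(comb(n, k) for k in range(1, n + 1))


n_shortlisted = 26  # assumption: the 26 shortlisted genes of Table 2
total = total_combinations(n_shortlisted)
print(total)  # 67108863, which equals 2**26 - 1
```

With tens of millions of candidate combinations, some filtering of candidates before cross validation is needed in practice.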

(5) Validation of Candidate Gene Signatures by Cross Validation

The candidate gene signatures (p<0.05, AUC>0.08, sensitivity>85% and specificity>85%) were ranked by k-fold cross validation to identify the optimal gene combination. The patients were randomly split into 2 folds (a training set and a test set), and this procedure was repeated 300 times.
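The repeated random two-fold split described above can be sketched as follows. This is one plausible reading of the procedure, not the inventors' actual code: the text does not state how the model was refitted in each repetition, so the sketch substitutes a simple cutoff search on the training half for model fitting. All function names are hypothetical and the scores are synthetic.

```python
import random


def accuracy(scores, labels, cutoff):
    """Fraction of patients for whom (score > cutoff) matches the response label."""
    hits = sum((s > cutoff) == bool(y) for s, y in zip(scores, labels))
    return hits / len(labels)


def best_cutoff(scores, labels):
    """Cutoff with the highest training accuracy (a stand-in for model fitting)."""
    # Include a value below every score so "everyone responds" is also a candidate.
    candidates = [min(scores) - 1.0] + sorted(set(scores))
    return max(candidates, key=lambda c: accuracy(scores, labels, c))


def repeated_two_fold_cv(scores, labels, repeats=300, seed=0):
    """Mean test accuracy over repeated random half/half splits."""
    rng = random.Random(seed)
    indices = list(range(len(labels)))
    half = len(indices) // 2
    test_accuracies = []
    for _ in range(repeats):
        rng.shuffle(indices)
        train, test = indices[:half], indices[half:]
        cutoff = best_cutoff([scores[i] for i in train], [labels[i] for i in train])
        test_accuracies.append(
            accuracy([scores[i] for i in test], [labels[i] for i in test], cutoff)
        )
    return sum(test_accuracies) / repeats


# Synthetic scores for 21 responders and 52 non-responders (illustration only).
data_rng = random.Random(1)
labels = [1] * 21 + [0] * 52
scores = [data_rng.gauss(1.0 if y else -1.0, 1.0) for y in labels]
print(f"mean cross validation accuracy: {repeated_two_fold_cv(scores, labels):.4f}")
```

Averaging over many random splits reduces the dependence of the estimate on any single partition, which matters for a cohort of only 73 patients.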

(6) Signal Transduction Pathway Analysis Based on Meta-Analysis

Signal transduction analysis was performed using a meta-analysis program, CBS Probe PINGS™ (CbsBioscience, Daejeon, KOR), which consists of 5 modules (PPI module, Path-Finder module, Path-Linker module, Path-maker module and Path-Lister module). For gene signature validation, signal transduction was analyzed for the pathways related to each patient's DEGs between tumor and non-tumor, and for the pathways related to the gene signature. The genes were mapped to the signal transduction pathways obtained from KEGG (Kyoto Encyclopedia of Genes and Genomes database). We selected the top 10 signal transduction pathways for each patient's DEGs and for the gene signature according to the weight of the numbers of interactions and interacting genes. We then compared the pathways related to each patient's DEGs with the pathways related to the gene signature. Further, we obtained the interacting genes that mapped to the signal transduction pathways via KEGG.

(7) Statistical Analysis

The relationship between treatment response and clinicopathologic variables was evaluated using Chi-square tests or Fisher's exact tests. Gene expression data were tested for normality with the Shapiro-Wilk test. Because the data from all 73 patients met normality assumptions, significant differences between tumors and non-tumors were evaluated using Student's t-test. Because the responder and non-responder data did not meet normality assumptions, significant differences between responders and non-responders were evaluated using the Wilcoxon test. Receiver operating characteristic (ROC) curve analysis was used to determine the accuracy of threshold values classifying tumor responders and non-responders using the gene signature. Kaplan-Meier (KM) survival curves were calculated using death as the end point for overall survival (OS) and using death and disease progression as the end points for progression free survival (PFS). The difference in KM curves was examined by the log-rank test and the difference in hazard ratio was examined by Cox regression analysis. Candidate gene signatures were analyzed using logistic regression to measure the relationships between response to sorafenib treatment, classification, and clinicopathologic variables. Significance was set at P<0.05 (two-tailed). All statistics were performed in R version 3.3.3 (R Development Core Team, https://www.r-project.org/).
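For readers less familiar with ROC analysis, the sketch below shows how an AUC and the sensitivity and specificity at a given cutoff can be computed from signature scores. It is written in plain Python for illustration; the original analysis was performed in R, and the function names and the toy data here are hypothetical.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a responder outscores a non-responder.

    Ties between a responder and a non-responder count as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one responder and one non-responder")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(scores, labels, cutoff):
    """Sensitivity and specificity when score > cutoff predicts a responder."""
    tp = sum(1 for s, y in zip(scores, labels) if y and s > cutoff)
    tn = sum(1 for s, y in zip(scores, labels) if not y and s <= cutoff)
    n_pos = sum(1 for y in labels if y)
    n_neg = len(labels) - n_pos
    return tp / n_pos, tn / n_neg


# Toy data: 1 = responder, 0 = non-responder (illustration only).
toy_scores = [3.0, 2.0, 2.0, 1.0]
toy_labels = [1, 0, 1, 0]
print(roc_auc(toy_scores, toy_labels))                       # 0.875
print(sensitivity_specificity(toy_scores, toy_labels, 1.5))  # (1.0, 0.5)
```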

2. Test Results

(1) Clinicopathologic Characteristics of Patients

Of the 73 patients treated with sorafenib, 21 were responders and 52 were non-responders. There were no statistically significant differences between responders and non-responders in gender, HBV, HCV, TNM stage, or BCLC stage. However, patients aged 55 years or older made up a larger proportion of the responder group than of the non-responder group. Patients with AFP (alpha fetoprotein) below 100 ng/ml made up the majority of the responder group, whereas the reverse was true in the non-responder group (Table 1).

TABLE 1

| Clinicopathologic parameter | Category | Responders (n = 21) | Non-responders (n = 52) | p-value* |
|---|---|---|---|---|
| Age (range) | <55 years | 3 (14.3%) | 23 (44.2%) | 0.0317 |
| | ≥55 years | 18 (85.7%) | 29 (55.8%) | |
| Gender (M/F) | Male | 16 (76.2%) | 41 (78.8%) | 0.7657 |
| | Female | 5 (23.8%) | 11 (21.2%) | |
| HBV (hepatitis B virus) (−1) | Absent | 3 (15.0%) | 7 (13.5%) | 1 |
| | Present | 17 (85.0%) | 45 (86.5%) | |
| HCV (hepatitis C virus) (−1) | Absent | 20 (100.0%) | 49 (94.2%) | 0.5553 |
| | Present | 0 (0.0%) | 3 (5.8%) | |
| BCLC stage | A | 1 (4.8%) | 1 (1.9%) | 0.3493 |
| | B | 2 (9.5%) | 13 (25.0%) | |
| | C | 18 (85.7%) | 37 (71.2%) | |
| | D | 0 (0.0%) | 1 (1.9%) | |
| AFP (−1) (−7) | <100 ng/ml | 13 (65.0%) | 15 (33.3%) | 0.0350 |
| | ≥100 ng/ml | 7 (35.0%) | 30 (66.7%) | |
| Tumor response | Complete response | 2 (9.5%) | 0 (0.0%) | |
| | Partial response | 9 (42.9%) | 0 (0.0%) | |
| | Stable disease | 10 (47.6%) | 0 (0.0%) | |
| | Progressive disease | 0 (0.0%) | 52 (100.0%) | |

*p values were calculated using Fisher's exact test.

(2) Differential Expression Gene Analyses of Tumors Versus Non-Tumors, and Responders Versus Non-Responders

DEG analysis between tumors and non-tumors revealed that 507 out of 730 genes were significantly differentially expressed. DEG analysis between responders and non-responders to sorafenib treatment revealed that 49 out of 730 genes were significantly differentially expressed. In total, 525 genes met the screening conditions (31 genes met both conditions). When logistic regression analysis was performed on the screened genes, 26 DEGs were statistically significant (Table 2).

TABLE 2 Sorafenib response correlated genes in DEGs

| No. | Gene | Logistic regression coefficient | Logistic regression p-value | T mean (n = 81) | NT mean (n = 51) | Fold change | Screen condition 1: T vs NT p-value* | Screen condition 2: Responder vs Non-responder p-value¶ |
|---|---|---|---|---|---|---|---|---|
| 1 | AR | | 3.58E−02 | 1707.42 | 5590.40 | −3.51 | 2.20E−16 | - |
| 2 | CD14 | | 1.91E−02 | 4686.37 | 15061.60 | −3.21 | 1.94E−12 | - |
| 3 | CDC148 | | 5.68E−03 | 721.22 | 1583.62 | −2.20 | 1.88E−15 | - |
| 4 | CDH1 | −0.000225 | 1.92E−02 | 5253.03 | 8897.98 | −1.71 | 7.36E−10 | 2.09E−02 |
| 5 | CHAD | 0.001787 | 2.19E−02 | 354.29 | 752.81 | −2.12 | 4.69E−08 | - |
| 6 | EFNA2 | −0.005687 | 1.53E−02 | 212.88 | 297.39 | −1.4 | 4.76E−03 | 1.46E−02 |
| 7 | FANCC | −0.002104 | 4.15E−02 | 714.64 | 529.24 | 1.35 | 4.13E−05 | 3.66E−02 |
| 8 | FANCL | | 9.24E−03 | 724.58 | 820.53 | −1.13 | 1.66E−02 | 9.28E−03 |
| 9 | IL1R1 | | 3.70E−02 | 769.27 | 1708.03 | −2.22 | 3.11E−10 | - |
| 10 | KAT2B | | 3.33E−02 | 1263.95 | 1911.99 | −1.51 | 9.41E−09 | - |
| 11 | LIG4 | | 4.80E−02 | 698.01 | 1003.63 | −1.44 | 2.96E−09 | - |
| 12 | MAP2K1 | −0.001009 | 3.65E−02 | 1984.00 | 5323.86 | −2.68 | 6.14E−15 | 2.46E−02 |
| 13 | PBRM1 | −0.001336 | 3.41E−02 | 1664.14 | 1838.44 | −1.1 | 1.03E−02 | 6.22E−03 |
| 14 | PIK3CB | | 3.43E−02 | 684.01 | 610.41 | 1.12 | 2.51E−02 | 2.78E−02 |
| 15 | PPARG | 0.001710 | 1.71E−02 | 674.21 | 508.69 | 1.33 | 5.67E−04 | 3.05E−02 |
| 16 | PPP2R1A | | 4.38E−02 | 5679.31 | 4752.90 | 1.19 | 5.29E−04 | - |
| 17 | PRKCA | | 4.41E−02 | 852.48 | 633.07 | 1.35 | 1.29E−04 | - |
| 18 | RAD21 | | 4.47E−02 | 6629.89 | 4231.07 | 1.57 | 1.93E−08 | 1.51E−02 |
| 19 | RFC4 | | 4.74E−02 | 1269.50 | 373.64 | 3.4 | 4.60E−16 | 1.78E−02 |
| 20 | SOCS1 | | 3.35E−02 | 284.01 | 589.12 | −2.07 | 3.29E−03 | 1.15E−02 |
| 21 | TNFSF10 | | 1.74E−02 | 4217.80 | 8154.5 | −1.93 | 6.14E−09 | 3.45E−02 |
| 22 | ACVR1B | | 2.76E−02 | - | - | - | - | 3.00E−03 |
| 23 | FGF2 | | 4.53E−02 | - | - | - | - | 8.33E−03 |
| 24 | GTF2H3 | | 3.67E−02 | - | - | - | - | 1.41E−02 |
| 25 | MEN1 | 0.002101 | 2.10E−02 | - | - | - | - | 3.55E−02 |
| 26 | POLB | | 3.97E−02 | - | - | - | - | 4.37E−02 |

Data that do not meet the screening conditions are not shown. Logistic regression coefficients are shown only for the genes composing the 8-gene signature.

* p values were calculated using Student's t-test.

¶ p values were calculated using the Wilcoxon test.

(3) Gene Combination Analysis and Candidate Gene Signatures

The top 5 candidate gene signatures were ranked by AUC, and their AUC, sensitivity, and specificity are shown in Table 3 below.

TABLE 3

| Rank | Gene signature | Logistic regression p-value | No. of combination genes | AUC | Sensitivity (%) | Specificity (%) |
|---|---|---|---|---|---|---|
| 1 | CD14_CDH1_EFNA2_LIG4_MEN1_PBRM1_POLB_PPARG_TNFSF10 | 1.40E−07 | 9 | 0.920 | 85.71 | 92.31 |
| 2 | AR_CDH1_EFNA2_FANCC_MAP2K1_MEN1_PBRM1_PPARG | 1.40E−07 | 8 | 0.906 | 85.71 | 92.31 |
| 3 | CDH1_EFNA2_FANCC_MEN1_PBRM1_PPARG_RFC4_SOCS1 | 1.80E−07 | 8 | 0.902 | 90.48 | 92.31 |
| 4 | CDH1_CHAD_EFNA2_FANCC_MAP2K1_MEN1_PBRM1_PPARG | 1.03E−07 | 8 | 0.895 | 85.71 | 94.23 |
| 5 | CDH1_EFNA2_FANCC_KAT2B_MAP2K1_MEN1_PBRM1_PPARG | 1.03E−07 | 8 | 0.888 | 85.71 | 94.23 |

(4) Gene Signature Selection with Cross Validation

The top 5 candidate gene signatures were validated with cross validation. The Rank 1 gene signature (CD14_CDH1_EFNA2_LIG4_MEN1_PBRM1_POLB_PPARG_TNFSF10) showed a cross validation accuracy of 82.00%. The Rank 2 gene signature (AR_CDH1_EFNA2_FANCC_MAP2K1_MEN1_PBRM1_PPARG) showed a cross validation accuracy of 82.00%. The Rank 3 gene signature (CDH1_EFNA2_FANCC_MEN1_PBRM1_PPARG_RFC4_SOCS1) showed a cross validation accuracy of 80.33%. The Rank 4 gene signature (CDH1_CHAD_EFNA2_FANCC_MAP2K1_MEN1_PBRM1_PPARG) showed a cross validation accuracy of 83.67% (FIG. 2). The Rank 5 gene signature (CDH1_EFNA2_FANCC_KAT2B_MAP2K1_MEN1_PBRM1_PPARG) showed a cross validation accuracy of 81.33%. Based on the cross validation accuracy, we selected the Rank 4 gene signature (CDH1_CHAD_EFNA2_FANCC_MAP2K1_MEN1_PBRM1_PPARG) as the gene signature for predicting sorafenib response.

(5) Calculation of Treatment Benefit Prediction Score (TBPS) through Logistic Regression Analysis

The regression coefficient values for each gene, obtained through the univariate logistic regression analysis on the combination of the eight genes selected by the above method (i.e., CDH1, CHAD, EFNA2, FANCC, MAP2K1, MEN1, PBRM1, and PPARG), are shown in the following Table 4.

TABLE 4

| Gene | Regression coefficient value |
|---|---|
| CDH1 | −0.000225 |
| CHAD | 0.001787 |
| EFNA2 | −0.005687 |
| FANCC | −0.002104 |
| MAP2K1 | −0.001009 |
| MEN1 | 0.002101 |
| PBRM1 | −0.001336 |
| PPARG | 0.001710 |

A Treatment Benefit Prediction Score (TBPS) was calculated according to the following equation, using the normalized expression levels of the respective 8 genes (i.e., CDH1, CHAD, EFNA2, FANCC, MAP2K1, MEN1, PBRM1, and PPARG) obtained with nCounter (Nanostring Technologies, Seattle, Wash.) and the regression coefficient values for each gene.

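$\mathrm{TBPS} = C_{\mathrm{CDH1}}\,G_{\mathrm{CDH1}} + C_{\mathrm{CHAD}}\,G_{\mathrm{CHAD}} + C_{\mathrm{EFNA2}}\,G_{\mathrm{EFNA2}} + C_{\mathrm{FANCC}}\,G_{\mathrm{FANCC}} + C_{\mathrm{MAP2K1}}\,G_{\mathrm{MAP2K1}} + C_{\mathrm{MEN1}}\,G_{\mathrm{MEN1}} + C_{\mathrm{PBRM1}}\,G_{\mathrm{PBRM1}} + C_{\mathrm{PPARG}}\,G_{\mathrm{PPARG}}$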

In the above equation, $C_{\mathrm{gene}}$ represents the regression coefficient value of the corresponding gene; and $G_{\mathrm{gene}}$ represents the normalized expression level of the corresponding gene which was obtained with nCounter (Nanostring Technologies, Seattle, Wash.). Thus, from the results of Table 4, the TBPS can also be calculated according to the following equation.

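$\mathrm{TBPS} = (-0.000225)\,G_{\mathrm{CDH1}} + (0.001787)\,G_{\mathrm{CHAD}} + (-0.005687)\,G_{\mathrm{EFNA2}} + (-0.002104)\,G_{\mathrm{FANCC}} + (-0.001009)\,G_{\mathrm{MAP2K1}} + (0.002101)\,G_{\mathrm{MEN1}} + (-0.001336)\,G_{\mathrm{PBRM1}} + (0.001710)\,G_{\mathrm{PPARG}}$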

The TBPS cutoff value determined as described above is −2.483069, which can be used as a threshold for predicting a response to sorafenib therapy. That is, after measuring the expression levels of CDH1, CHAD, EFNA2, FANCC, MAP2K1, MEN1, PBRM1, and PPARG genes in a hepatocellular carcinoma patient, when the TBPS value calculated according to the above equation is greater than −2.483069, the patient can be classified as a patient who exhibits a response to sorafenib treatment (i.e., a patient who exhibits susceptibility to sorafenib treatment), and when the TBPS value is −2.483069 or less, the patient can be classified as a patient who does not exhibit a response to sorafenib treatment (i.e., a patient who does not exhibit susceptibility to sorafenib treatment).

(6) Kaplan-Meier Analysis of Prognosis Between 8-Gene Signature High Group and Low Group

Using KM analysis, the prognosis of predicted good responders and predicted poor responders was compared in terms of overall survival and progression free survival. The median overall survival of all patients was 11.3 months (95% CI; 4.6-11.2), that of predicted good responders was 27.3 months (95% CI; 11.3-28.5), and that of predicted poor responders was 6.7 months (95% CI; 3.6-6.8). The hazard ratio between predicted good responders and predicted poor responders was 0.27 (95% CI; 0.13-0.59, p-value=0.0005). The median progression free survival of all patients was 2.9 months (95% CI; 2.8-3.4), that of predicted good responders was 5.8 months (95% CI; 3.9-8.4), and that of predicted poor responders was 2.8 months (95% CI; 2.7-3.0). The hazard ratio between predicted good responders and predicted poor responders was 0.21 (95% CI; 0.11-0.42, p-value<0.0001). The median time to progression of all patients was 2.9 months (95% CI; 2.8-3.4), that of predicted good responders was 5.8 months (95% CI; 3.9-8.4), and that of predicted poor responders was 2.8 months (95% CI; 2.7-3.0). The hazard ratio between predicted good responders and predicted poor responders was 0.22 (95% CI; 0.11-0.44, p-value<0.0001) (FIG. 3 and Table 5).

TABLE 5

| | Sorafenib treatment | Predicted good responders to sorafenib treatment | Predicted poor responders to sorafenib treatment |
|---|---|---|---|
| Median overall survival | 11.3 months (4.6-11.2 months) | 27.3 months (11.3-28.5 months) | 6.7 months (3.6-6.8 months) |
| Median progression-free survival | 2.9 months (2.8-3.4 months) | 5.8 months (3.9-8.4 months) | 2.8 months (2.7-3.0 months) |
| Median time to progression | 2.9 months (2.8-3.4 months) | 5.8 months (3.9-8.4 months) | 2.8 months (2.7-3.0 months) |

(7) Logistic Regression Analysis of Independence of Selected Gene Signature

The independence of the selected gene signature (CDH1_CHAD_EFNA2_FANCC_MAP2K1_MEN1_PBRM1_PPARG) was investigated with logistic regression analysis. In univariable logistic regression analysis, age and the gene signature were significantly and positively correlated with sorafenib response, whereas AFP was significantly and negatively correlated with sorafenib response. In multivariable logistic regression analysis of the factors significantly correlated with sorafenib response, AFP was not an independent predictor, but age and the gene signature were independent predictors of sorafenib response (Tables 6 and 7).

TABLE 6 Univariable logistic regression analysis

| Variable | n | Coefficient | se(coefficient) | z | p-value |
|---|---|---|---|---|---|
| Candidate genes | | | | | |
| CDH1_CHAD_EFNA2_FANCC_MAP2K1_MEN1_PBRM1_PPARG (low vs high) | 73 | 4.5850 | 0.8618 | 5.321 | 1.03E−07 |
| Clinicopathological feature | | | | | |
| Age (<55 years vs ≥55 years) | 73 | 1.5600 | 0.6833 | 2.283 | 0.0224 |
| Gender (male vs female) | 73 | 0.1525 | 0.6147 | 0.248 | 0.8040 |
| HBV (absent vs present) | 72 | −0.1262 | 0.7465 | −0.169 | 0.8660 |
| HCV (absent vs present) | 72 | −15.6700 | 1385.3778 | −0.011 | 0.9910 |
| TNM stage (I-II vs III-IV) | 73 | −1.4271 | 0.9534 | −1.497 | 0.1340 |
| BCLC (A-B vs C-D) | 73 | 0.7932 | 0.6976 | 1.137 | 0.2555 |
| AFP (<100 ng/ml vs ≥100 ng/ml) | 65 | −1.3122 | 0.5655 | −2.320 | 0.0203 |

TABLE 7 Multivariable logistic regression analysis

| Variable | Odds ratio | 95% CI | p-value |
|---|---|---|---|
| CDH1_CHAD_EFNA2_FANCC_MAP2K1_MEN1_PBRM1_PPARG (low vs high) | 139.90 | 12.72-1538.21 | 5.36E−05 |
| Age (<55 years vs ≥55 years) | 13.60 | 1.17-157.62 | 0.0368 |
| AFP (<100 ng/ml vs ≥100 ng/ml) | 1.05 | 0.15-7.31 | 0.9625 |
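As a reading aid for Table 6, the z and p-value columns appear consistent with a Wald test, in which z is the coefficient divided by its standard error and the p-value is two-sided under the standard normal distribution. That interpretation is an assumption, since the text does not name the test; the sketch below (hypothetical function name) simply recomputes the Age row under it.

```python
from math import erfc, sqrt


def wald_test(coefficient, standard_error):
    """Return (z, two-sided p-value) assuming a standard normal reference."""
    z = coefficient / standard_error
    p_value = erfc(abs(z) / sqrt(2.0))
    return z, p_value


# Age row of Table 6: coefficient 1.5600, se(coefficient) 0.6833.
z_age, p_age = wald_test(1.5600, 0.6833)
print(round(z_age, 3), round(p_age, 4))  # compare with 2.283 and 0.0224 in Table 6
```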

(8) Signal Transduction Pathway Analysis and High Interaction Frequency Ratio Genes Analysis for Selected Gene Signature

Signal transduction analysis through the meta-analysis showed that the gene signature of sorafenib responders was highly related to 6 pathways, i.e., pathways in cancer, human papillomavirus infection, proteoglycans in cancer, PI3K-Akt signaling pathway, focal adhesion, and Ras signaling pathway. In addition, 3 genes (EGFR, CTNNB1, SRC) were high interaction frequency ratio genes related to both the gene signature and sorafenib responders (Table 8).

TABLE 8 Gene signature related pathways and high interaction frequency ratio genes

| | Name of pathway / High interaction frequency ratio genes |
|---|---|
| Pathway | Pathways in cancer |
| | Human papillomavirus infection |
| | Proteoglycans in cancer |
| | PI3K-Akt signaling pathway |
| | Focal adhesion |
| | Ras signaling pathway |
| High interaction frequency ratio genes | EGFR |
| | CTNNB1 |
| | SRC |

3. Discussion

The present inventors investigated the mRNA expression of 730 genes with the nCounter system in HCC tumor tissues and surrounding non-cancerous tissues, and then identified 525 DEGs and 26 genes correlated with the disease control response to sorafenib. To reflect the influence of the various genes on response, we determined the logistic regression coefficient of each gene and used it to weight the corresponding gene expression value. Based on the weighted gene expression values, the present inventors computed all gene combinations and validated the candidate gene signatures with cross validation. Kaplan-Meier analysis, meta-analysis, and univariable/multivariable analyses were also conducted.

Based on cross validation performance, we selected the 8-gene signature CDH1_CHAD_EFNA2_FANCC_MAP2K1_MEN1_PBRM1_PPARG. The 8-gene signature achieved 91.78% accuracy and 83.67% cross validation accuracy with a cutoff of −2.483069. Meta-analysis revealed that EGFR, CTNNB1, and SRC interact with the 8 genes composing the gene signature. Of these, EGFR confers resistance to sorafenib through Akt activation, and SRC confers resistance to sorafenib through the FAK-SRC signaling pathway as a bypass track (Ezzoukhry Z, et al., EGFR activation is a potential determinant of primary resistance of hepatocellular carcinoma cells to sorafenib. Int J Cancer 2012; 131: 2961-2969; Zhou Q, et al., Activation of Focal Adhesion Kinase and Src Mediates Acquired Sorafenib Resistance in A549 Human Lung Adenocarcinoma Xenografts. J Pharmacol Exp Ther 2017; 363: 428-443). Additionally, the PI3K-Akt signaling pathway and the focal adhesion pathway were observed to interact with the 8-gene signature. With this supporting biological explanation, the 8-gene signature increased the disease control rate from 28.77% to 85.71%. In predicted good responders, OS and PFS were improved compared with all patients and with predicted poor responders.
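The percentages quoted above can be tied together with a small confusion-matrix calculation. The patient counts below are not stated in the text; they are inferred here from the reported sensitivity (85.71%) and specificity (94.23%) of the selected signature and from the 21 responders and 52 non-responders in Table 1, so they should be read as an assumption. The variable names are hypothetical.

```python
responders, non_responders = 21, 52        # Table 1
sensitivity, specificity = 0.8571, 0.9423  # Table 3, Rank 4 signature

# Inferred counts (assumption): nearest whole patients matching the reported rates.
true_pos = round(sensitivity * responders)       # responders predicted as good responders
true_neg = round(specificity * non_responders)   # non-responders predicted as poor responders
false_pos = non_responders - true_neg
total = responders + non_responders

accuracy_pct = round(100 * (true_pos + true_neg) / total, 2)
dcr_all_pct = round(100 * responders / total, 2)
dcr_predicted_good_pct = round(100 * true_pos / (true_pos + false_pos), 2)

print(true_pos, true_neg, false_pos)                    # 18 49 3
print(accuracy_pct, dcr_all_pct, dcr_predicted_good_pct)  # 91.78 28.77 85.71
```

Under these inferred counts, the disease control rate among predicted good responders (18 of 21) coincides numerically with the sensitivity (18 of 21) only because the number of predicted good responders happens to equal the number of actual responders.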

Because sorafenib has a low overall response rate, this 8-gene signature might be the best compromise between sorafenib effectiveness and coverage of sorafenib treatment patients. Therefore, the 8-gene signature can be usefully applied as a DCR biomarker for predicting the response to sorafenib in HCC patients.

1. An analytical method of providing information for the diagnosis of a hepatocellular carcinoma patient having susceptibility to sorafenib, the method comprising measuring expression levels of CDH1, CHAD, EFNA2, FANCC, MAP2K1, MEN1, PBRM1, and PPARG genes, respectively, in carcinoma tissue samples which are externally discharged from the hepatocellular carcinoma patient.

2. The analytical method according to claim 1, wherein the measuring is carried out by measuring the mRNA expression levels of CDH1, CHAD, EFNA2, FANCC, MAP2K1, MEN1, PBRM1, and PPARG genes, respectively.